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Abstract 

Using Monte Carlo dynamics and the Monte Carlo 
Histogram Method, the simple three-dimensional 27 
monomer lattice copolymer is examined in depth. 
The thermodynamic properties of various sequences 
are examined contrasting the behavior of good and 
poor folding sequences. The good (fast folding) 
sequences have sharp well-defined thermodynamic 
transitions while the slow folding sequences have 
broad ones. We find two independent transitions: a 
collapse transition to compact states and a folding 
transition from compact states to the native state. 
The collapse transition is second order-like, while 
folding is first order. The system is also studied as 
a function of the energy parameters. In particular, 
as the average energetic drive toward compactness is 
reduced, the two transitions approach each other. At 
zero average drive, collapse and folding occur almost 
simultaneously; i.e., the chain collapses directly into 
the native state. At a specific value of this energy 
drive the folding temperature falls below the glass 
point, indicating that the chain is now trapped in 
local minimum. By varying one parameter in this 
simple model, we obtain a diverse array of behaviors 
which may be useful in understanding the different 
folding properties of various proteins. 

I INTRODUCTION 

Simple models are one powerful theoretical tool for 
the study of complex systems. The idea is to re¬ 
move all but the essentials from the original system 
which will ideally allow for a more complete analysis. 
There is often a trade-off between the complexity of 
the model (or how faithfully it represents the system 
of interest) and the thoroughness of the analysis. In 
the case of protein folding, research has spanned the 
entire spectrum from all-atom molecular dynamics 


with solventErH’QQ to completa enumeration of sim¬ 
ple lattice polymer systemswith many works in 
between these two extremes. Naturally the more re¬ 
alistic simulations do not yield results as thorough as 
the simpler ones. In the all-atoms simulations a large 
amount of supercomputer time is required for runs of 
hundreds of picoseconds of a single protein molecule 
(plus solvent). In contrast, high-end workstations can 
be used to simulate lattice polymers. Many different 
sequences can be simulated over a range of tempera¬ 
tures for time scales comparable to the folding time. 
We do not want to imply that one set of techniques is 
superior than the other, but rather in studying a sys¬ 
tem as complex as proteins many different approaches 
are necessary. In fact the two limits complement each 
other. Simple systems permit detailed analysis while 
the more complex systems allow for contact with real 
proteins. Connecting these two limits would allow for 
a more through analysis of real protein systems. Such 
an analysis has been recently carried ont.luB 

In a previous worki we examined the kinetics of 
a simple three-dimensional lattice polymer system. 
This system is too large for exact enumeration studies 
but is still small enough for detailed analysis. Many 
studies of lattices models seem to focus either on ther¬ 
modynamics or on kinetics, consid^ing each in isola¬ 
tion. However, as previously shownd and shown here, 
a combined approach that considers both the kinetics 
and thermodynamics of the same system is important 
in understanding the model in detail. We have deter¬ 
mined that there is an important relation between 
kinetics (the “glass transition”) and thermodynam¬ 
ics (the folding temperature) in determining whether 
a sequence will fold or not, aii-ij:^ that was advanced 
by Bryngelson and WolynesjEl’til and later explored 
by Leopold and Onuchic.tJ In this work we continue 
the study of this system, examining both thermody¬ 
namics and kinetics in greater detail. 
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Some of the earliest work on the thermodynamics 
of protein-like lattic^joLymers has been performed by 
Chan, Lau and Dill.t^EJ They examined short chains 
in two dimensions for which it is possible to enumer¬ 
ate all conformations. This allowed them to calcu¬ 
late any thermodynamic quantity by simply summing 
over states. By measuring a variety of parameters, 
such as the average compactness, number of contacts 
and hydrophobic core, they found a distinct differ¬ 
ence between folding and non-folding sequences. Al¬ 
though exact enumeration studies are extremely pow¬ 
erful, they are limited to small chains, usually in two 
dimensions. In three dimensions many have stud¬ 
ied the 27 monomer system on a simple cubic lattice 
which has a maximally compact shape of a 3 x 3 x 3 
cube. It is not possible to enumerate all conforma¬ 
tions, but it is still possible to enumerate all compact 
conformations (what is often referred to as the cube 
spectrum) and then calculate approximate thermody¬ 
namics using just the cube states.E3’llj However, as 
we show in this work, care must be taken in using 
this technique since the cube states are not an ac¬ 
curate approximation to the full density of states; in 
particular there are many low energy (i.e., thermody¬ 
namically relevant) states that are not cubes. 

For longer chains, one can use the standard Monte 
Carlo techniquelm for calculating thermodynamic 
properties. Skolnick, Sikorski and Kolinskill3’llj’E2luse 
a dynamic Monte Carlo method to study folding of 
realistic protein-like structures on a diamond lattice. 
The word dynamic here is used to indicated that the 
move set has been selected in such a way that the ac¬ 
tual time course of folding is as realistic as possible. 
Note that this is not necessary for calculating ther¬ 
modynamic quantities, since it is only necessary that 
the moves satisfy defied balance. In fact, O’Toole 
and PanagiotopoulosEil found that using the dynamic 
moves for the 3D cubic lattice system led to sam¬ 
pling problems. At low temperature they observed 
a hysteresis for the average energy. This is probably 
caused by trying to sample b^w the glass transitiom 
This transition was predictedcEl and explicitly showntl 
to exist in these simple lattice systems. O’Toole and 
Panagiotopoulos were able to circumvent this prob¬ 
lem by using a more sophisticated sampling proce¬ 
dure based on thOpHosenbluth and Rosenbluth chain 
growth algorithm^ They studied chains as long as 
48 monomers and have used this technique to study 
therrnajdynamically significant low energy conforma- 
tions.c3 Others have used Monte Carlo techniques to 
examine the effects of various potential functions on 
the kinetics and thermodyinamics of two-dimensional 
lattice polymer systems.Eil 


These previous studies used the standard Monte 
Carlo technique. The simulation is run at a given 
temperature and various averages are computed. To 
obtain thermodynamic quantities for a different tem¬ 
perature, another simulation (at the new tempera¬ 
ture) is performed. However, it is possible to extract 
information about temperatures other than the sim¬ 
ulation temperature using a technicme-often called 
the Monte Carlo Histogram MethodE^^ Using this 
technique one can calculate an approximate density 
of states for the system, which can be used to cal¬ 
culate any thermodynamic quantity of interest over 
a range of temperatures. Because of the small sys¬ 
tem sizes used, we can obtain accurate results over a 
broad range of temperatures. In particular, we can 
extrapolate into the glass region where running nor¬ 
mal simulations becomes extremely difficult and time 
consuming. The technique also facilitates the finding 
of peaks or zeros of various thermodynamic functions. 
It can be used to calculate extensive quantities like 
the free energy or entropy of the system, which are 
difficult to extract by the standard Monte Carlo pro¬ 
cedure. One can not only vary the temperature, but 
also the various parameters in the potential. One can 
examine how the system behaves thermodynamically 
at a range of parameter values without the need to 
run new simulations. This in-depth analysis of the 
thermodynamics has enabled us, along with several 
others, to begin to connect the behavior and proper¬ 
ties of tl^^ simple model systems with those of real 
proteins .LrB 

II MODEL AND METHODS 

A Definition of model and dynamics 

The model studied was a simple three-dimensional 
cubic lattice polymer. All chains were 27 monomers 
long with two different monomer types. The poten¬ 
tial was a contact interaction between monomers that 
are nearest neighbors on the lattice (but that are not 
linked covalently). The energy of a contact was Ei for 
a pair of the same monomer type and for a pair 
of different monomers. This model was the same one 
used in our previous studyO (which can be consulted 
for a more detailed explanation of the model). The 
energy function is: 

E = NiEi + NuEu, (1) 

where Ni is the number of contacts between 
monomers of the same type (like contacts) and 
the number of contacts between monomers of differ¬ 
ent types. 


Socci & Onuchic 


Kinetic and thermodynamic analysis of proteinlike heteropolymers 


3 


Comer Move 




Crankshaft Moves 


End Moves 




Figure 1: The three types of moves used in the sim¬ 
ulations. The light circles represent the possible lat¬ 
tice points to which a given monomer can move, pro¬ 
vided that that point is not occupied. In the case 
of the end and crankshaft moves, one of the possible 
moves is picked at random. Note that the corner and 
crankshaft moves are exclusive: A non-end monomer 
can only make one or the other depending on the 
position of its neighbors along the chain. 


Although it is easiest to express the energy function 
in terms of the Ei and variables, some properties 
of the model are more clearly understood by consid¬ 
ering an equivalent set of parameters, E^vg and A, 
defined as follows: 

Eavg = — {El + Ey) 

A = {Ey-Ei). (2) 

Aavg represents the overall drive toward forming con¬ 
tacts or compacting the chain. If it is less than zero, 
contact formation will be favored. A determines the 
heterogeneity of the heteropolymer. In the limit that 
A = 0 the mc^el becomes a homopolymer. In our 
previous work,EI ifavg = — 2 and A = ^giving values 
-3 and -1 for Ei and Ey respectively.cJ This insures 
that the chain collapses rapidly, compared to fold¬ 
ing, and that the minimum energy state is a max¬ 
imally compact cube for the sequences considered 
here. When we say the chain has folded we mean it 
is in the native state. This is distinguished from col¬ 
lapse, which refers to chains that can be in any com¬ 
pact conformation. The same parameters were used 
for part of the results presented here. In addition we 
show how the model behaves as these parameters are 
varied. 

The move set is shown in figure and consists of 
local one- or two-monomer moves. These moves are 
the standard set used in lattice polymer simulations. 
They are believed to produce reasonably realistic dy¬ 
namics (see references |^,^for details). For 


thermodynamics calculations, it is not necessary to 
use a move set with-this property, and other move sets 
have been used.EirLJ There are however, two potential 
problems with realistic dynamic move sets: ergodic- 
ity and glassy behavior. If one is interested only in 
kinetics, then these are not really problems but rather 
properties of the model. For thermodynamic calcula¬ 
tions, inaccessibility is not a problem either, since we 
can consider the definition of the model to include 
only the states accessible by the move set specified 
(hence it will be by definition ergodic). Therefore, as 
long as the minimum energy cube state is accessible 
from an unfolded chain, there will be no problems. 

The glass transition presents a more difficult prob¬ 
lem for thermodynamic calculations. At low temper¬ 
atures, the dynamics of the system slow down sub¬ 
stantially. In particular, the correlation times be¬ 
come quite large and it takes longer to explore con¬ 
formational space. This gives rise to two errors in 
the Monte Carlo calculations. First, it takes a long 
time for the system to relax, making it difficult to 
get the system into thermal equilibrium. This con¬ 
tributes a systematic error to the results and it can 
give rise_tp_the hysteresis effect seen in the previous 
studies.Eilo Second, because of the long autocorrela¬ 
tion times subsequent samples are no longer statisti¬ 
cally independent of each other. This has the effect 
of increasing the variance of any observable (and the 
corresponding statistical error). The actual variance 
is given by: 


^Luai = (l + 2:^)a2 (3) 

'samp 

where is the usual variance calculated from the 
samples, t^c is the integrated autocorrelation time 
and Tsamp is the number of time steps between 
sampl^ measured in units of Monte Carlo 

steps)As increases, longer simulations are 
necessary to get statistically reasonable results. That 
is, in a simulation of N steps there will be A/rac 
“effectively independent samples.At low temper¬ 
atures the time required to obtain enough indepen¬ 
dent samples becomes enormous. One solution is to 
use a different moAp^|S^,pSiich as the Rosenbluth chain 
growth algorithmEj’EirLJ The auto-correlation time 
for this move set does not increase as rapidly, al¬ 
lowing simulations at lower temperatures. Our so¬ 
lution instead is to run the simulations well above 
the glass transition temperature, and then to use the 
histogram method to extrapolate to lower tempera¬ 
tures. 
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B Techniques for calculating thermodynamics 
quantities 


For short enough chains (usually in two-dimensions) 
it is possible to enumerate all lattice conformations 
and thereby calculate the partition function for the 
system along with any other quantity of interest 
For the thr^dimensional 27 monomer chain, it is 
not practical to enumerate all conformations but it 
is possible to enumerate all of the maximally compact 
(cube) conformations. One could then approximate 
the partition function by just summing over the cube 
states. Previous studies have used this method to 
calculate thermodynamics quantities of this system. 
It was hoped that at low temperatures this approx¬ 
imation might be reasonable. We show later that, 
although it can give rough qualitative results, using 
only the cube states leads to appreciable errors. 

Since exact enumeration is impossible, Monte 
Carlo sampling is used for calculating thermody¬ 
namic quantities. The usual technique runs the sim¬ 
ulation at a given temperature, collecting samples to 
determine thermal averages. The process is repeated 
for several temperatures to get the averages as a func¬ 
tion of temperature. There are several drawbacks to 
the standard Monte Carlo procedure. Calculating ex¬ 
tensive variables like the free-energy or entropy is dif¬ 
ficult. Also, if one wants to find peaks or zeros of a 
given quantity (like the specific heat peak, to iden¬ 
tify the transition temperature), one must scan over 
a range of temperatures to located the critical value. 

It is possible to extract more information from a 
single Monte Carlo run than just the thermal averages 
at the temperature the simulation was performed. 
The technique is called the histogram method or den¬ 
sity of states method. It has a long history and it 
has b|ee^r|ecmtlv ^discovered by a variety of au- 
thors.Ej’Ej’ESEZloEj The actual Monte Carlo sam¬ 
pling algorithm itself is unchanged. But instead of 
just calculating thermal averages, one keeps track of 
the number of times a specific energy is encountered 
in the simulation; i.e., an energy histogram is calcu¬ 
lated. This histogram, h{E,T), measures the proba¬ 
bility of energy E occuring at temperature T. It is 
equal to the thermal average of the density of states: 


h{E,r) 


n{E)e 

Z{T’) 


( 4 ) 


where Z {T') is the partition function at temperature 
T' is 


Z(T') = ^n(F;)e-^/^', (5) 

E 


and n{E) is the density of states for energy E (the 
number of conformations with energy E). The Boltz¬ 
mann factor kh has been set equal to I and T' is the 
temperature of the simulation. One now has the den¬ 
sity of states up to a multiplicative factor: 

n{E) = h{E,T')e^^^' Z{T'), (6) 


where Z{T') is the unknown multiplicative constant. 
For intensive quantities, thermal averages are calcu¬ 
lated using: 


{O) (T) 


Y.^O{E)n{E)e-^/^ 

j:EO{E)h{E,r)e-^/^+^/^' 


Note, Z{T') cancels out of the above expression. 

If one is interested in calculating extensive quan¬ 
tities like the free energy or the entropy, iOrecomes 
necessary to determine the constant Z(T').a For our 
system, it is possible to calculate this constant and 
therefore obtain the density of states. To determine 
this constant, all we need is the multiplicity of any 
energy state. For example, the sequences we study 
have a non-degenerate ground state. This means 
n{Egs) = I, where Egs is the energy of the lowest 
energy cube. Z{T') can then be determined and the 
free energy is then calculated using E = —TlogZ 
and equation 

There is a limit to the temperature range over 
which equation is valid. For temperatures too far 
from the simulation temperature, the errors in the 
density of states calculated from equation ^ become 
significant. At any given temperature, the sy^m is 
only sampling a given region of phase space.^ For 
example, figure || shows energy histograms at a high 
and a low temperatures. For the high T simulations, 
the ground state is never probed, and likewise for 
the low T some high energy states are never reached. 
Consequently, the density of states will be incorrect 
for regions not sampled properly (in fact it equals 
zero for regions that are never sampled). This limits 
the temperature range we can extrapolate from any 
simulation. One needs to monitor the errors in the 
density of states. A soluticp,to this problem is the 
multiple histogram method.E^ The idea is to use sev¬ 
eral different simulations and patch the histograms 
together. Although there are some subtleties to this 
technique, it can be powerful. 

For the 27 monomer long polymer used here, a sin¬ 
gle histogram gives adequate results over the range 
of temperatures of interest. The reason is that the 
width of the energy histograms in general scale as 
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\/'/N, where N is the system size. As we shall show 
shortly, the system is small enough to insure that 
the histograms are broad and a large region of phase 
space is sampled at any given temperature. 

Ill RESULTS AND DISCUSSION 

Several Monte Carlo runs were performed over a 
range of temperatures. Six sequences were used all 
with a hxed ratio of monomer types (14 to 13). Ta¬ 
ble 1 ^ lists the sequences and the energy of their na¬ 
tive states. To get some idea of what a typical (i.e. 
randomly chosen) sequence is and how it compares 
to the sequences used in this study, we generated 
over 10,000 random sequences (with a 14:13 ratio of 
monomer types). Approximately 1,000 sequences had 
unique ground states. Figure |3| shows a histogram of 
the ground state energies for these sequences. The 
distribution is roughly gaussian. The most probably 
energy is approximately -76. One sequence examined 
(sequence 013) has a typical ground state energy for 
random sequences. We did not generate any of the 
minimum energy (-84) sequences at random. The 
two that we used wher^both designed: one using a 
Monte Carlo algorithms and the other by mutating 
a -82 energy sequence. The thermodynamic averages 
of several quantities (average energy, contacts, na¬ 
tive contacts, specific heat, etc.) were calculated at 



Figure 2: Energy histograms taken at high (T = 3.15) 
and low (T = 1.41) temperatures. Note that for each 
histogram there are regions of energy that are not 
sampled at all. In particular at high temperatures 
the ground state is not probed while at low temper¬ 
atures the high energy (unfolded) conformations are 
not probed. 


Run 

Sequence 

-£^min 

002 

ABABBBBBABBABABAAABBAAAAAAB 

-84 

004 

AABAABAABBABAAABABBABABABBB 

-84 

005 

AABAABAABBABBAABABBABABABBB 

-82 

006 

AABABBABAABBABAAAABABAABBBB 

-80 

007 

ABBABBABABABAABABABABBBABAA 

-80 

013 

ABBBABBABAABBBAAABBABAABABA 

-76 


Table 1: The various sequences used in this pa¬ 
per. The last four (005, 006, 007, 013) were gen¬ 
erated at random. Sequence 002 was optimized by 
Shakhnovich (Ref. Sequence 004 is a single 

monomer mutation of 005 (R 13 ^ A). Both 002 and 
004 have the lowest energies possible for the poten¬ 
tial used and have native states that are completely 
unfrustrated. Emin is the energy of the native states. 
These same six sequences were studied in our pre¬ 
vious work (Ref. ^ which examined the kinetics of 
folding. 

each of the simulation temperatures. In addition, his¬ 
tograms of the number of like versus unlike contacts 
{Ni and N^) were calculated. We chose to histogram 
these variables instead of the energy directly because 
these histograms can be used to extrapolate not only 
other temperatures but other parameter values {Ei 
and E„). 

Figure ^ shows the folding time and collapse times 
as a function of the inverse temperature. The two 
different collapse times are the time to find the first 
cube (i.e., a maximally compact state) and the time 
to form the first 25 (out of 28) contacts. For high 
temperatures, the collapse times are sequence inde¬ 
pendent (self-averaging). The folding times are se¬ 
quence dependent. Similarly, the collapse times be¬ 
low the glass transition are also sequence dependent. 
The kinetic glass trai^ition temperature was defined 
in our previous worUj as the temperature at which 
the folding time is half way between its minimum and 
maximum value (the maximum being determined by 
the time limit on the simulation and chosen to be 
much longer than the fastest folding time). Both the 
folding and collapse time show non-Arrhenius behav¬ 
ior at high temperatures. 

A Histograms: First vs. second order 
transitions 

Before using the histograms to calculate the density 
of states, we examined them to determine how much 
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Figure 3: Ground state energy histogram for non¬ 
degenerate sequences. 13,563 sequences with a fix 
ratio of monomer types (14:13) were generated at 
random. The energy and multiplicity of the mini¬ 
mum energy cube was determined from exhaustive 
enumeration of the cube conformations. 1,061 se¬ 
quences (7.8%) were found to have unique ground 
states. The histogram of ground state energies from 
these sequences is plotted (solid circles). The dotted 
line is a lest squares fit of a gaussian to the data. 
Note, we did not find any minimal energy (-84) se¬ 
quences in our random sample. The total number of 
possible sequences is = 20,058,300. 


Figure 4: Folding and collapse times versus inverse 
temperature. Time is in Monte Carlo steps. The 
solid lines represent the mean folding time. The mid¬ 
dle dotted lines represent the mean compaction time 
to any cube. The bottom lines represent the mean 
compaction time to a partially compact conforma¬ 
tion with 25 (out of 28) contacts. Error bars are the 
standard deviation of the mean. Note, simulations 
were run for a set amount of time Tmax- For high and 
low temperatures some runs were not able to find the 
folded (or compact) state in this time. In these cases 
Tmax was averaged in, so the times shown are lower 
bounds to the actual mean first-passage times. 


of phase space is sampled at different temperatures. 
This (as mentioned above) determines how far we can 
extrapolate from the simulation temperature using 
the histogram technique. Figure ^ shows the energy 
histogram for sequence 002 for several temperatures. 
Because of the small size of our system, they are all 
rather broad, with widths of roughly 30 energy units. 
Examining the behavior of the curves as a function of 
temperature, we see the both first and second order¬ 
like behavior of the system. At high temperatures 
(between 5 and 2, see fig. la) a single energy peak 
moves steadily to lower values. This is what would 
be expected from a second order-like transition. At 
lower temperatures (fig. |b) the plots now have a 
bimodal distribution and as the temperature is de¬ 
creased there is a shift from one peak to the other. 
This is characteristic of a first order-like transition. If 
we examine the histograms as a function of the num¬ 
ber of contacts, the same behavior is seen (fig. ^). At 
high temperatures the plots are unimodal and shift 
to larger number of contacts (higher compactness) 
as the temperature is lowered. This continues un¬ 


til the maximum reaches roughly 20 contacts around 
T = 2. At lower temperatures this peak remains fixed 
at about 20 contacts and another peak forms at 28 
contacts. Because this peak at 28 contacts occurs at 
low temperatures and when there is a peak at -84 in 
the energy histograms, we expect that it is due to 
occupation of the native state and the few other low 
energy cubes. As the the temperature is decreased 
there is a shift in population between the two peaks. 
This is consistent with the idea that there are two 
thermodynamic transitions: a collapse to compact 
structures and a folding transition to the native state. 
The collapse transition occurs at a higher tempera¬ 
ture and is second order-like. The folding transition 
is first order-like. 

Since the histograms are broad enough we used the 
single histogram technique in the subsequent calcula¬ 
tions. Because we are interested mainly in the prop¬ 
erties of the ground state, we chose a temperature 
which is low enough for the ground state to be suf¬ 
ficiently populated and yet high enough so that we 
sample as much of the conformation space as pos- 
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Figure 5: Histograms of energy for several tempera¬ 
tures. The sequence is number 002. For high tem¬ 
peratures (a) the plots have one peak which moves to 
lower energies as the temperature decreases. For low 
temperatures (b) the plots are bimodal. 
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Figure 6: Histograms of contacts for several temper¬ 
atures, again for sequence 002. The behavior of the 
plots is similar to that of the energy histograms (see 
fig. As the temperature is lowered a single peak 
moves from a low to a high number of contacts un¬ 
til it reaches roughly 20 (a). At this point a second 
peak forms at 28 contacts and we see a first order-like 
transition between the two (b). 
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sible. At a temperature of 1.58 there is a sizeable 
peak at the ground state and substantial sampling 
of the higher energy conformations. Kinetically it 
turns out that T = 1.58 the chains fold most rapidly 
(i.e., the mean first-passage time to the folded state 
is smallest). So we expect at this temperature we are 
moving most rapidly through the compact conforma¬ 
tions. Since this temperature is far enough away from 
the kinetic glass temperature (which was previously 
measured to be approximately 1 for this system) we 
do not have to worry about the problem of long re¬ 
laxation times which would make it difficult to equi¬ 
librate and would require a long sampling time to 
reduce statistical errors. We calculated the energy 
autocorrelation function: 


CeU) = 


{ErEr+t) - {Ef 

(A2) - {Ef 


( 8 ) 


where Et is the energy of the system at time step r. 
At long times this function should have the following 
form: 

CE{t) - (9) 

where t^c is the autocorrelation time.H At the tem¬ 
perature T = 1.58 we get an autocorrelation time of 
roughly 500,000 Monte Carlo steps. For all our ther¬ 
modynamic simulations we equilibrated our system 
for 20Tac and ran for 1.08 x 10® steps (2000 times Tac 
which gives roughly 2000 independent samples). 


B Density of states 

Using the histograms from the Monte Carlo simu¬ 
lations and equation ^ the density of states for the 
various sequences is calculated. Figure ^ shows the 
densities for three sequences with low, medium and 
high values for Amin (002, 006, 013). For comparison 
the cube spectrum, determined by exact enumera¬ 
tion, has also been plotted. At low energies the the 
density of states is different for each sequences. In 
particular, sequence 002 has more low energy confor¬ 
mations that are not cubes as compared to 006 and 
013. All three sequences have a notch in their den¬ 
sity, but the gap between the notch and the folded 
state is larger for sequence 002 than for the others. 

At energies below —60 the densities of states for 
the three sequences are roughly the same. This part 
of the spectrum is self-averaging as expected since it 
should depend on only the ratio of monomer types. 
At very high energy there is considerable scatter 
in the plots. This is due to the poor sampling of 
this area of conformational space. In particular, the 




Figure 7: Density of states versus energy calculated 
using the Monte Carlo histogram technique. Three 
different sequences are shown (002, 006, 013) with dif¬ 
ferent ground state energies. The first plot (a) is the 
full density of states. The lines are from the Monte 
Carlo calculation. The points show the density of 
states for just the cube conformations (determined by 
exact enumeration). The second plot (b) is a blow-up 
showing the low energy region of the first. At high 
energies the densities of states are sequence indepen¬ 
dent while at low energies they are strongly sequence 
dependent. 


















Socci & Onuchic 


Kinetic and thermodynamic analysis of proteinlike heteropolymers 


9 



Figure 8: Comparison of the cube spectrum from ex¬ 
act enumeration and from the Monte Carlo histogram 
calculation. Error bars are the standard error of the 
mean from several Monte Carlo runs. The tempera¬ 
ture of the simulations was 1.58. At this temperature 
the average energy is —55.5 and the percent of pop¬ 
ulation in the ground state is 0.2%. 

curves in figure ^ for sequence 002 and 006 do not 
even extend to zero energy, indicating that these con¬ 
formations are not sampled at all. However, we ex¬ 
pect that for low temperature thermodynamic calcu¬ 
lations this will not pose problems. 

As a simple check of the accuracy of the Monte 
Carlo histogram technique in this system, we com¬ 
pare the exact cube spectrum (from enumeration of 
all cubes) to the cube spectrum calculated from the 
histogram data. Remember that there is an unknown 
normalization factor, which we determined by setting 
the density of states for the lowest energy cube equal 
to 1. Figure ^ shows the comparisons. For cubes 
up to an energy of —52 there is excellent agreement 
between the histogram calculation and the exact an¬ 
swer. At high energies we see the same sampling 
problem; cubes with energy greater than —50 are not 
sampled at all, since they make up a negligible frac¬ 
tion of the conformations at these energies. However, 
for low temperature calculations the errors should be 
negligible. 

C Computing thermodynamic quantities 

Figure ^ is a plot of the average energy as a function 
of temperature for the same three sequences (002, 
006 and 013) whose densities of states are shown in 


Figure 9: Average energy versus temperature for 
three different sequences (002, 006 and 013). The 
lines are calculated using the histogram technique 
from simulations at T = 1.58; the points are from 
normal Monte Carlo simulations (i.e., they were cal¬ 
culated from the usual averaging technique at several 
different temperatures). For most temperatures there 
is excellent agreement between the two. As we ap¬ 
proach the glass temperature the normal Monte Carlo 
technique starts to deviate due to the divergence of 
the relaxation (equilibration) time of the system. 

figure ^ At high temperatures (T > 2.5) all their 
sequences have the roughly the same average energy. 
At lower temperatures the sequences are no longer 
self-averaging. The two sequences with the higher en¬ 
ergy folded states (006 and 013) have a fairly broad 
transition while the low energy sequences (002) have 
a comparatively sharper transition. A similar result 
is seen in the specific heat, which is plotted in fig¬ 
ure 1^. Sequence 002 has a much sharper and higher 
specific heat peak which occurs at a higher temper¬ 
ature. The other sequences have broader smaller 
peaks. The peak in the specific heat occurs at tem¬ 
peratures slightly higher than the folding tempera¬ 
ture (see tab. H) and indicates the transition from the 
unfolded chain to the collapsed state rather than the 
transition to the native state. At high temperatures 
the specific heat is sequence independent. 

Also included in figures and ^ are data points 
calculated using the standard Monte Carlo averaging 
technique for comparison. There is excellent agree¬ 
ment between the histogram curves and the points 
up until the kinetic glass temperature (which is at 
T « 1 for these sequences). At the glass temperature 
the usual Monte Carlo technique has a problem with 
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Figure 10: Specific heat versus temperature for se¬ 
quences 002, 006 and 013. The lines are calculated 
using the histogram technique from simulations at 
T = 1.58; the points are from normal Monte Carlo 
simulations (i.e., they were calculated from the usual 
averaging technique at several temperatures). At the 
kinetic glass temperature (T = 1) there is substantial 
error in the normal Monte Carlo result. 

the increasing autocorrelation time (all simulations 
were equilibrate and sampled for the same amount 
of time). The histogram method, however, allows us 
to probe beyond the glass temperature since the low 
energy states can be sampled accurately at higher 
temperatures (see fig. ||) . 

One useful feature of the histogram technique is the 
ability to determine extrema and zeros of thermody¬ 
namic functions. For example, the folding tempera¬ 
ture Tf is the temperature at which the population 
of the native state equals one half: 


PnATf) = 


g Ena.t/'Tf 


( 10 ) 


Once the density of states has been determined, one 
can numerically solise for Tj using any standard root¬ 
finding algorithm.^ Figure plots Pnat(T) for the 
three sequences along with the folding temperatures. 
Also shown is a plot of the probability of being semi¬ 
compact (which we define as structures have 20 or 
more contacts): 


Pc.o{T) = 


E E n{E,C)e-^l^ 

E 020 


( 11 ) 




Figure 11: Folding and collapse transitions for se¬ 
quences 002, 006 and 013. Figure (a) is a plot of 
the probability to be in the native state (Pnat) versus 
temperature, while figure (b) plots the probability 
to be compact (20 out of 28 contacts, Peso)- The 
folding temperature is defined as the temperature at 
which P„at = 5 . The collapse temperature is defined 
similarly. The folding temperature is a sequence- 
dependent quantity while the collapse temperature is 
roughly sequence-independent (self-averaging). We 
expect the collapse transition to depend on the ra¬ 
tio of monomer types (i.e., the over-all drive to com¬ 
pactness) and therefore it should not depend on the 
specific sequence. 


where n{E, C) is the density of states as a function of 
energy and contacts. Note that in order to compute 
this quantity we need to keep track of histograms as 
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Run 

Full DOS 

Cubes 

Percent Error 

002 

1.29(2) 

1.763 

37.2% 

004 

1.26(1) 

1.695 

34.1% 

005 

1.15(2) 

1.429 

24.2% 

006 

0.94(6) 

1.049 

11.5% 

013 

0.83(5) 

0.935 

12 .6% 


These states will reduce the stability of the native 
state. The cube approximation is better for the non¬ 
folding high energy sequences than for the low energy 
sequences. Consequently, it fails more seriously for 
the sequence we are most interested in, namely the 
good folding sequences. 


Table 2: Comparison of the folding temperature Tj 
calculated using the full density of states (from the 
histogram method) and just the cube states (from 
exact enumeration). Numbers in parentheses are the 
uncertainty in the last digit. The last column is the 
percent error of the cube-only calculation. 


a function of energy and contacts. Histograms of just 
the energy would not have allowed us to sort out the 
compact states from the non-compact ones. Similar 
to Tf the compaction temperature Tc occurs when 
the probability of being semi-compact (20 contacts) 
equals one half. Figure |l^ shows the Pcao curves along 
with the values for Tc- We chose 20 contacts because 
that was the point at which the histograms (see fig. ^) 
changed from their second order to first order-like 
behavior. Therefore, we expect that measuring this 
quantity will probe the hrst transition from random 
coil to globule (semi-compact) states. 

As expected the folding temperature [Tf) is se¬ 
quence dependent. Also, the transition curves for 
folding are much sharper than they for collapse. The 
lower the energy of the ground state the higher the 
folding temperature for that sequence. In contrast, 
the compaction temperature [T^) is almost sequence 
independent (it varies by only 4% versus a 43% differ¬ 
ence for Tf). One would expect the compaction tem¬ 
perature to be self-averaging since it should depend 
on the average composition of the sequence (which 
is the same for all sequences used in this work). At 
the compaction temperature the native state occupa¬ 
tion (Pnat) is very small. This is consistent with the 
previous observation from the histograms (see figs. 
and There are two separate thermodynamic tran¬ 
sition: collapse from a random coil and then folding 
to the native state. 

It is clear from figure ^ that using just the cubes 
to calculate thermodynamics can lead to potentially 
large errors. Table || compares the folding temper¬ 
ature calculated using the full density of states with 
the temperature calculated solely on the basis of cube 
states. The cube results consistently over-estimate 
the folding temperature. This is not surprising since 
many low energy non-cube states are being neglected. 


D Combining kinetics and 
thermodynamics—unfolding time 


Much of the work (both experimental and theoreti¬ 
cal) on protein folding deals with the forward process 
(unfolded to folded). However, studying the reverse 
process, the unfolding of nascent proteins, may not 
only provide a wealth of information, but may also 
be a great deal easier. In unfolding simulations, the 
initial condition is well-defined (the folded state) and 
by varying the various parameters it is fairly easy to 
induce unfolding. There have been several works ex¬ 
amining unfplr^g using detailed molecular dynamics 
simulations ej 

Using the data previously calculated, we can com¬ 
pute an unfolding time for our lattice chains. We hrst 
make the two-state assumption, namely that there is 
an unfolded state (U) that is in thermal equilibrium 
with the folded state (F): 


U. 


( 12 ) 


We can then calculate an unfolding time (r^) as fol¬ 
lows: 

Tu[T)=^^^rf{T) (13) 

where [U], [F] are the populations of the unfolded 
and folded state respectively and Tf{T) is the folding 
time as a function of temperature. The ratio [F] / [U] 
is given by P„at(r)/(1 - Pnat(T)), using Pnat('r) de- 
hned by eq. O. Figure 12 shows a plot of the un¬ 


folding time versus 1/T for several sequences. Unlike 
the folding time (hg. U), the unfolding time has a 
much simpler behavior. For temperatures above the 
kinetic glass temperature (Tg), the unfolding times 
vary almost linearly with 1/T and have slightly dif¬ 
ferent slopes for the various sequences. The slopes 
are roughly proportional to Tf, sequences 004 and 
002 have the steepest slopes and 013 has the shallow¬ 
est. In contrast to the folding time, which shows clear 
non-Arrhenius behavior over this temperature range, 
the unfolding time is nearly Arrhenius. We expect 
due to the large enthalpic barrier that entropic ef¬ 
fects are less noticeable in unfolding than folding. At 
low temperatures the unfolding time rolls off due to 
the cutoff in the simulation time (this is the same 
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Figure 12: Unfolding times versus 1/T for several se¬ 
quences. The time is plotted on a log scale. The 
linear relationship between time and 1/T for tem¬ 
peratures greater than Tg is much simpler than the 
behavior of the folding time. 


roll-off seen in the folding time at low temperatures; 
see fig. 


E Exploring parameter space 


All the results so far have been for simulations with 
contact energies Tavg = — 2 and A = 2 (Ei = — 3 
and Tu = —1). These values were chosen to in¬ 
sure that the ground state (native state) would be 
a cube. For sequences like 002 and 004 there exists 
a cube (out of the 103,346 possibilities) that has no 
weak contacts (contacts between different monomer 
types). This cube will be the ground state as long as 
El = (Tavg — A/2) < 0, irrespective of what is 
(i.e., Eu can be greater than zero). We call these se¬ 
quences unfrustrated. All of the other sequences have 
at least one weak contact between unlike monomers 
even in their lowest energy cube (see fig. |^. These 
are frustrated sequences. For frustrated sequences, it 
is not clear that the minimum-energy cube will be the 
minimum-energy conformation. There may be some 
non-cube conformation with fewer total contacts but 
more good contacts than the cube conformation. For 
sequence 005, which has 27 good contacts in its min¬ 
imum cube conformation, and 006 and 007, which 
have 26 good contacts, there are no other conforma¬ 
tions that have more good contacts. Consequently, 
the cube conformations will be the ground states as 
long as Eu = (Aavg -k A/2) < 0.^ For sequence 013, 



Figure 13: The native conformations of sequence 002 
(left) and 006 (right). Note that 006 has 2 “weak” 
contacts (indicated in the figure with dotted lines) in 
its lowest energy state. 


which has 24 good contacts in its minimum-energy 
cube, it is possible that there is some non-cube con¬ 
formation with more than 24 good contacts. We can 
not exhaustively enumerate all the non-cube confor¬ 
mations but we can state empirically that no confor¬ 
mation with a lower energy was found in any of the 
Monte Carlo runs (see note 46). 


We now explore how the model behaves as we vary 
the potential parameters. Specifically, how do the 
various kinetic and thermodynamic properties de¬ 
pend on these parameters? Sequence 002 was ex¬ 
amined in detail. As mentioned, this sequence has 
an unfrustrated cube conformation (see fig This 
cube will be the ground state whenever Ei < 0. If T; 
is greater than zero then the minimum energy con¬ 
formation is the completely unfolded chain with no 
contacts. This clearly would not represent a protein 
under folding conditions; thus we ignore this region 
of parameter space. Several simulations with various 
valueg£3 of E^/ A ranging from 0 to approximately 
2.5 were run.a Figure shows a plot of the fold¬ 
ing and compaction times versus Tavg/A. The times 
plotted in these figures are taken from the tempera¬ 
ture with the fastest time (i.e., the minimum point 
in figure ^ for each value of E^vg/A. As the abso¬ 
lute value of Tavg/A is decreased from 2.5 to 0, both 
compaction times (the time to form 25 and 28 con¬ 
tacts) increase, although the change is small. This is 
expected since the drive to form contacts decreases 
as |Tavg/A| is reduced. For the folding time there 
is an opposite and more dramatic effect. Decreas¬ 
ing iTavg/Aj decreases the folding time (r/), almost 
two orders of magnitude. Also, at Tavg = 0, Tf 
almost equals T 2 s (the time to make 28 contacts). 
What is happening is that as |Tavg/A| decreases, we 
are destabilizing non-native cubes relative to the na¬ 
tive state. At large |Tavg/A|, these low energy non¬ 
native cubes behave as traps slowing down the fold- 


























Socci & Onuchic 


Kinetic and thermodynamic analysis of proteinlike heteropolymers 


13 



0.0 0.5 1.0 1.5 2.0 2.5 3.0 


Figure 14: The minimum folding time and the com¬ 
paction times plotted as a function of iHavg/A for se¬ 
quence 002. The line with the triangles is the mean 
folding time (i.e., the time to find the native state). 
The squares are the mean time to find the first cube 
state (not necessarily the native state) and the di¬ 
amonds are the mean time to find a conformation 
with 25 contacts. For each of these times, there is a 
minimum point in the point of time vs temperature 
(see fig. IH). It is that minimum (fastest) time that is 
plotted here for each value of Flavg/A. 

ing rate. Reducing iFlavg/Aj increases their energy 
relative to the native cube eliminating them as traps. 
This in turn increases the folding rate. Alternatively, 
as Aavg/A ^ Q, Eu increases until it becomes posi¬ 
tive. At this point making weak (incorrect) contacts 
is unfavored relative to breaking them. This keeps 
the chain from forming these weak, incorrect contacts 
which would trap it in states different from the na¬ 
tive state. The chain is more effectively tunneled into 
the native conformation; i.e., the first cube made is 
almost always the native one. By varying illavg/A we 
can control the time-scale separation between folding 
and collapse to maximally compact states. 

Next, we examined how the folding temperature 
{Tf) and the kinetic glass temperature {Tg) varied as 
a function of Aavg/A. In particular, for which val¬ 
ues of £^avg/A is T/ greater than Tgl For these val¬ 
ues of Flavg/A, the chain will fold before the native 
state becomes inaccessible. The histogram method 
is used to calculate Tj for various values of Aavg/A. 
The technique is the same as the one used to ex¬ 
trapolate to different temperatures. In this case one 
needs histograms as a function of good and weak 
contacts (which can be calculated from the energy- 





Figure 15: A plot of the folding temperature (Tf) 
and the kinetic glass temperature (Tg) as a function 
of the average drive to compactness (i?avg)- Both 
the temperatures and the energies have been scaled 
by A. The dotted line is a linear fit of the glass 
temperatures. The solid line is not a fit to the Tf 
points but was calculated explicitly via the histogram 
technique. For values of i^avg/A approximately less 
than 1.7 Tf > Tg so the chains will fold before hitting 
the glass transition. For Aavg/A greater than 1.7 the 
glass transition occurs before folding so the chains 
do not reach their ground state within the simulation 
time. 

contact histograms previously used). To calculate Tg 
we must run simulations at various values of Aavg/A 
since there is no way to extrapolate as in the case of 
Tf. The two temperatures {Tf and Tg) are plotted 
for sequence 002 in figure |l^. Note that we plot the 
temperature normalized by A (i.e., in units of A) just 
as we plot Aavg normalized by A.^ The glass tem¬ 
perature varies almost linearly with iilavg/A. Previ¬ 
ous work on glass transitions in heteropolymers have 
shown that .the transition occurs after the collapse of 
the system.BEJ In fact it was shown that the poly¬ 
mer needs to collapse in order to have a glass transi¬ 
tion. This is consistent with the behavior we see: as 
|i?avg/A| increase so does the collapse temperature 
(Tc). In fact as we will see shortly T^ > Tg for all 
values of iilavg/A. As |i?avg/A| increases, the depth 
of the local minima increases relative to the unfolded 
state. It becomes harder to escape from local traps 
so the chain “freezes” at higher temperatures. 

The folding temperature also increases as the ab¬ 
solute value of iilavg/A increases but reaches a lim¬ 
iting value. This behavior of Tf with Flavg is easy 
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to understand. As |Aavg/A| is increased, the stabil¬ 
ity gap becomes larger; hence T/ increases. However, 
it eventually asymptotes to a limiting value of ap¬ 
proximately 0.87. This limiting value turns out to be 
precisely the folding temperature that would be cal¬ 
culated using just the cube conformations. In table 
we see that for sequence 002 Tj = 1.763 for the cube- 
only calculation. In calculation A = 2 so we need 
to divide the temperature by 2, giving 0.88. The 
reason Tf approaches the cube-only value is that as 
|ifavg/A| increases the cube states decrease in energy 
more than the non-cube states since they have more 
contacts. Eventually, at low enough Eavg/A all the 
low energy states are just cubes. It is the low energy 
states that determine Tf. Once Tf reaches this limit, 
it is unchanged by further changes to Ea_.yg/A since 
the relative energy between the various cube states is 
determined by A which we are holding constant at 1. 

Looking at both the Tf and Tg lines we see there 
is a key point at which Tg becomes greater than 
Tf. At Aavg/A « 1.7 the chains become glassy be¬ 
fore they become thermodynamically stable; conse¬ 
quently, they will not be able to fold. For Aavg/A 
greater than 1.7, Tg > Tf. The chain is trapped 
in local minimum and will not find the native state 
within the simulation time. When Aavg/A is less than 
1.7 Tg < Tf, so the chains fold to the native state 
and will be thermodynamically stable before the dy¬ 
namics slow down. By including the results for the 
collapse transition temperature (the temperature at 
which half the chains have at least 20 contacts) a 
qualitative phase diagram can be drawn (see fig. [l^ ). 
There are four regions: random coil, collapsed glob¬ 
ule, folded and collapsed frozen state. The phase 
diagram is very similar to other lattice models and 
theoretical calculations of heteropolymers .^E3 The 
vertical dotted line represents the transition from the 
glassy to folded phase. To the right of it (i.e. large 
lAavg/Aj) Tf/Tg < 1 and to the left of the line (small 
lAavg/AI) Tf/Tg > 1. As |ifavg/A| is decreased the 
folding and collapse curves converge. This is the same 
behavior observed in the kinetic data (see fig II- As 
Aavg/A approaches zero, becomes positive. Col¬ 
lapsed states with weak contacts will be unfavored 
relative to states with no contacts. This drives the 
polymer to form only correct (good) contacts, so the 
chains will collapse almost directly to the correctly 
folded state. Another way to understand this is that 
non-native cube conformations are unfavored as Eu 
increases. At Aavg/A = 0 roughly half of the cubes 
for sequence 002 will have positive energies. This 
removes them as possible kinetic traps. For all val¬ 
ues of Favg/A, the collapse temperature is greater 





Figure 16: Phase diagram for sequence 002. There 
are four regions: random coil, collapsed globule, col¬ 
lapsed frozen, and folded. The solid line between the 
random coils and the collapsed globule state is the 
collapse transition, the temperature at which half the 
chains have 20 contacts. The second solid line is equal 
to either Tf or Tg, whichever is greater at that value 
of Favg/A. The dotted line equals the lesser of Tg or 
Tf. The vertical dash-dotted line shows the transi¬ 
tion from the folding region (where Tf > Tg) to the 
frozen region. It occurs at Favg/A —1.7. 


than not only the folding, but also the glass, tem¬ 
perature.. Previous heteropolymer mean-field calcu¬ 
lations^ show that Tg must be less than Tg, the col¬ 
lapse temperature, consistent with what we find here. 
Perhaps most interestingly, we see that by modulat¬ 
ing a single parameter in our model we obtain a range 
of qualitatively different folding behaviors. Recently, 
work has been done on the classification and exami¬ 
nation of the various possible folding regimes with a 
comparison to experimental data.Q 

IV CONCLUSIONS 

We have continued our comprehensive analysis of the 
27 monomer cube lattice heteropolymer. The Monte 
Carlo histogram method proved extremely useful, al¬ 
lowing us to determine the density of states for this 
system and then to calculate a broad range of ther¬ 
modynamic quantities. In particular, the method 
overcame the problem of dynamical slowing down 
at low temperatures. Like many other heteropoly¬ 
mer studies (both analytical and numerical) we find 
two different transitions: a collapse transition with a 
roughly sequence-independent collapse temperature 
and a folding (to the native state) transition with 
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a sequence-dependent folding temperature. The col¬ 
lapse transition has a second order-like behavior and 
the folding transition seems first order-like. The good 
folding sequences, i.e., the sequences that are sta¬ 
ble and have fast folding times, have sharper, more 
clearly defined transitions, as viewed from the tem¬ 
perature dependence of the average energy and the 
specific heat. Combining kinetic and thermodynamic 
data, we studied the unfolding behavior of the sys¬ 
tem. The unfolding rate has a much simpler tem¬ 
perature dependence than the folding rate. The rate 
varies roughly linearly with 1 /T for a broad range of 
temperature up to the glass point. 

After the density of states was obtained we were 
not only able to extrapolate to different tempera¬ 
tures but also to different parameter values of the 
energy function. The systems were examined as a 
function of the average drive toward compactness, 
where the average drive is the average of the two con¬ 
tact energies divided (normalized) by their difference 
(Aavg/A). There is a specific value for this energy 
drive at which the kinetic glass temperature became 
greater than the folding temperature, indicating that 
the system would no longer be able to fold within the 
simulation time due to trapping. We constructed a 
phase diagram as a function of this average drive and 
temperature (also normalized by the splitting). As 
the average energy drive is reduced, the two transi¬ 
tions, collapse and folding, converge. At zero-average 
drive the system collapses almost directly into the na¬ 
tive state. 

One criticism of these models is that they are too 
simple to represent real proteins. However, even in 
this simple model we see a broad and diverse range 
of behaviors depending on the parameters used. It 
seems likely that some of the behaviors of real pro¬ 
teins can be explained by some particular set of pa¬ 
rameters. More importantly, it may well be the case 
that different proteins have different folding behav¬ 
iors. Some proteins may fold extremely rapidly to the 
native state, literally collapsing into the native state, 
while other proteins may have a clear separation in 
time scales between collapse to a compact but non¬ 
native ensemble of structures and the rearrangement 
of the chain to the final native form. In our model we 
can interpolate between these two regimes by mod¬ 
ulating one parameter. In the future, more realis¬ 
tic models that are still simple enough for a through 
analysis may reveal more about the properties and 
functions of real proteins. 
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